{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<img src=\"http://hilpisch.com/tpq_logo.png\" alt=\"The Python Quants\" width=\"35%\" align=\"right\" border=\"0\"><br><br><br>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Listed Volatility and Variance Derivatives\n",
    "\n",
    "**Wiley Finance (2017)**\n",
    "\n",
    "Dr. Yves J. Hilpisch | The Python Quants GmbH\n",
    "\n",
    "http://tpq.io | [@dyjh](http://twitter.com/dyjh) | http://books.tpq.io\n",
    "\n",
    "<img src=\"http://hilpisch.com/../images/lvvd_cover.png\" alt=\"Derivatives Analytics with Python\" width=\"30%\" align=\"left\" border=\"0\">"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Introduction to Python"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Python has become a powerful programming language and has developed a huge ecosystem of helpful libraries over the last couple of years. This chapter provides a concise overview of Python and two of the major pillars of the the so-called *scientific stack*:\n",
    "\n",
    "* NumPy (see http://numpy.org)\n",
    "* pandas (see http://pandas.pydata.org)\n",
    "\n",
    "NumPy provides performant array operations on numerical data while pandas is specifically designed to handle more complex data analytics operations, e.g. on (financial) times series data."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Such an introductory chapter &mdash; only addressing selected topics relevant to the contents of this book &mdash; can of course not replace a thourough introduction to Python and the libraries covered. However, if you are rather new to Python or programming in general you might get a first overview and a feeling of what Python is all about. If you are already experienced in another language typically used in quantitative finance (e.g. Matlab, R, C++, VBA), you see how typical data structures, programming paradigms and idioms in Python look like.\n",
    "\n",
    "For a comprehensive overview of Python applied to finance see Hilpisch (2014). Other, more general introductions, to the language with a scientific and data analysis focus are Haenel et al. (2013), Langtangen (2009) and McKinney (2012).\n",
    "\n",
    "This chapter and the rest of the book is based on Python 2.7 although the majority of the code should be easily transformed to Python 3.5 after some minor modifications."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Python Basics"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "This section introduces basic Python data types and structures, control structures and some Python idioms."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Data Types"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "It is noteworthy that Python is a *dynamically typed system* which means that types of objects are inferred from their contexts. Let us start with numbers. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [],
   "source": [
    "a = 3  # defining an integer object\n",
    "type(a)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [],
   "source": [
    "a.bit_length()  # number of bits used"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [],
   "source": [
    "b = 5.  # defining a float object\n",
    "type(b)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Python can handle arbitrarily large integers which is quite beneficial for number theoretical applications, for instance."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [],
   "source": [
    "c = 10 ** 100  # googol number\n",
    "type(c)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [],
   "source": [
    "c # long integer object"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [],
   "source": [
    "c.bit_length()  # number of bits used"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Arithmetic operations on these objects work as expected."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [],
   "source": [
    "3 / 5  # division"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [],
   "source": [
    "a * b  # multiplication"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [],
   "source": [
    "a - b  # difference"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [],
   "source": [
    "b + a  # addition"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [],
   "source": [
    "a ** b  # power"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Many oftenly used mathematical functions are found in the ``math`` module which is part of Python's standard library."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [],
   "source": [
    "import math  # importing the library into the namespace"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [],
   "source": [
    "math.log(a)  # natural logarithm"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [],
   "source": [
    "math.exp(a)  # exponential function"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [],
   "source": [
    "math.sin(b)  # sine function"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Another important basic data type are string objects."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [],
   "source": [
    "s = 'Listed Volatility and Variance Derivatives.'\n",
    "type(s)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "This object type has multiple methods attached."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [],
   "source": [
    "s.lower()  # converting to lower case characters"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [],
   "source": [
    "s.upper()  # converting to upper case characters"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Strings objects can be easily sliced. Note that Python has in general zero-based numbering and indexing."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [],
   "source": [
    "s[0:6]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Such objects can also be combined using the ``+`` operator. The index value -1 represents the last character of a string (or last element of a sequence in general)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    },
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "st = s[0:6] + s[-13:-1]\n",
    "print(st)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "String replacements are often used to parametrize text output."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [],
   "source": [
    "repl = \"My name is %s, I am %d years old and %4.2f m tall.\"\n",
    "## replace %s by a string, %d by an integer and\n",
    "## %4.2f by a float showing 2 decimal values\n",
    "print(repl % ('Peter', 35, 1.88)) "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "A different way to reach the same goal is the following."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [],
   "source": [
    "repl = \"My name is {:s}, I am {:d} years old and {:4.2f} m tall.\"\n",
    "print(repl.format('Peter', 35, 1.88))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Data Structures"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "A light weight data structure are tuples. These are immutable collections of other objects and are constucted by objects separated by commas &mdash; with or without parentheses. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [],
   "source": [
    "t1 = (a, b, st)\n",
    "t1"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [],
   "source": [
    "type(t1)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [],
   "source": [
    "t2 = st, b, a\n",
    "t2"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [],
   "source": [
    "type(t2)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Nested structures are also possible. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [],
   "source": [
    "t = (t1, t2)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [],
   "source": [
    "t"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [],
   "source": [
    "t[0][2]  # take 3rd element of 1st element"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "List objects are mutable collections of other objects and are generally constructed by providing a comma separated collection of objects in brackets."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [],
   "source": [
    "l = [a, b, st]\n",
    "l"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [],
   "source": [
    "type(l)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [],
   "source": [
    "l.append(s.split()[3])  # append 4th word of string\n",
    "l"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Sorting is a typical operation on list objects which can also be constructed using the ``list`` constructor (here applied to a tuple object)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [],
   "source": [
    "l = list(('Z', 'Q', 'D', 'J', 'E', 'H'))\n",
    "l"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [],
   "source": [
    "l.sort()  # in-place sorting\n",
    "l"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Dictionary objects are so-called key-value stores and are generally constructed with curley brackets."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [],
   "source": [
    "d = {'int_obj': a,\n",
    "    'float_obj': b,\n",
    "    'string_obj': st}\n",
    "type(d)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [],
   "source": [
    "d"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [],
   "source": [
    "d['float_obj']  # look-up of value given key"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [],
   "source": [
    "d['long_obj'] = c / 10 ** 90  # adding new key value pair\n",
    "d"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Keys and values of a dictionary object can be retrieved as list objects."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [],
   "source": [
    "d.keys()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [],
   "source": [
    "d.values()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Control Structures"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Iterations are very important operations in programming in general and financial analytics in particular. Many Python objects are iterable which proves rather convenient in many circumstances. Consider the special list object constructor ``range``."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [],
   "source": [
    "range(5)  # all integers from zero to 5 excluded"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [],
   "source": [
    "range(3, 15, 2)  # start at 3, step with 2 until 15 excluded"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Such a list object constructor is often used in the context of a ``for`` loop."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [],
   "source": [
    "for i in range(5):\n",
    "    print(i ** 2, end=' ')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "However, you can iterate over any sequence."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [],
   "source": [
    "## iteration over list object\n",
    "for _ in l:\n",
    "    print(_, end=' ')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [],
   "source": [
    "## iteration over string object\n",
    "for c in st:\n",
    "    print(c + '|', end=' ')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "``while`` loops are similar to their counterparts in other languages."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [],
   "source": [
    "i = 0  # initialize counter\n",
    "while i < 5:\n",
    "    print(i ** 0.5, end=' ')  # output\n",
    "    i += 1  # increase counter by 1"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The ``if-elif-else`` control structure is introduced below in the context of Python function definitions."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Special Python Idioms "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Python in many places relies on a number of special idioms. Let us start with a rather popular one, the list comprehension."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [],
   "source": [
    "lc = [i ** 2 for i in range(10)]\n",
    "lc"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As the name suggests, the result is a list object."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [],
   "source": [
    "type(lc)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "So-called lambda or anonymous functions are usuful helpers in many places."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [],
   "source": [
    "f = lambda x: math.cos(x) # returns cos of x\n",
    "f(5)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "List comprehensions can be combined with lambda functions to achieve concise constructions of list objects."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [],
   "source": [
    "lc = [f(x) for x in range(10)]\n",
    "lc"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "However, there is an even more concise way of constructing the same list object &mdash; using functional programming approaches, in the case to follow with ``map``."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [],
   "source": [
    "list(map(lambda x: math.cos(x), range(10)))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "In general, one works with regular Python functions (as opposed to lambda functions) which are constructed as follows."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [],
   "source": [
    "def f(x):\n",
    "    return math.exp(x)\n",
    "f(5)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The general construction looks like this."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [],
   "source": [
    "def f(*args):  # multiple arguments\n",
    "    for arg in args:\n",
    "        print(arg)\n",
    "        # do something with arguments\n",
    "    return None  # return result(s) (not necessary)\n",
    "\n",
    "f(l)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Consider the following function definition which returns different values/strings based on an ``if-elif-else`` control structure."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [],
   "source": [
    "import random  # import random number library\n",
    "a = random.randint(0, 1000)  # draw random number between 0 and 1000\n",
    "print(\"Random number is %d\" % a)\n",
    "def number_decide(number):\n",
    "    if a < 10:\n",
    "        return \"Number is single digit.\"\n",
    "    elif 10 <= a < 100:\n",
    "        return \"Number is double digit.\"\n",
    "    else:\n",
    "        return \"Number is triple digit.\"\n",
    "number_decide(a)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "A specialty of Python are generator objects. One constructor for such objects that is commonly used is ``xrange``. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [],
   "source": [
    "g = range(10)\n",
    "type(g)  # object type"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [],
   "source": [
    "g  # object instance"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [],
   "source": [
    "for _ in g:\n",
    "    print(_, end=' ')  # integers are \"generated\" when needed"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Generator objects can in many scenarios replace (typical) list objects and have the major advantage that they are in general much more memory efficient. Consider a financial algorithm that requires 10 mn loops. Iterating over a list of integers from 0 to 9,999,999 is not efficient since the algorithm (in general) does not need to have all these numbers available at the same time. But this is what happens when using ``range`` for such a loop.\n",
    "\n",
    "Consider the following construction of the respective list object containing all integers."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [],
   "source": [
    "%time r = list(range(10000000))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "This object consumes 80 MB (!) of RAM (10 mn times 8 bytes)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [],
   "source": [
    "import sys\n",
    "sys.getsizeof(r)  # size in bytes of object"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "On the other hand, consider the analogous construction based on a generator (``range``) object. It is much, much faster since no memory has to be allocated, no list object has to be generated up-front, etc."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [],
   "source": [
    "%time xr = range(10000000)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Memory consumption is also much, much more efficient &mdash; 40 bytes compared to 80 MB."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [],
   "source": [
    "sys.getsizeof(xr)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "However, in practical applications the two can be used often interchangeably such that one should always resort to the more efficient alternative when possible. The following examples calculate &mdash; using the functional programming operation ``reduce`` &mdash; the sum of all integers from 0 to 9,999,999. Although in this case there is hardly a performance difference, the first operation requires 80 MB of memory while the second might only require less than 100 bytes."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from functools import reduce"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "%time reduce(lambda x, y: x + y, list(range(1000000)))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [],
   "source": [
    "%time reduce(lambda x, y: x + y, range(1000000))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "More Pythonic (and faster in general) is to calculate the sum using the built-in ``sum`` function &mdash; in this case a significant performance advantage for the generator approach emerges."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "%timeit sum(list(range(1000000)))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [],
   "source": [
    "%timeit sum(range(1000000))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "There is also a way of indirectly constructing a generator object, i.e. by the use of parentheses. The following code results in a generator object for the sine values of the numbers from 0 to 99."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [],
   "source": [
    "g = (math.sin(x) for x in range(100))\n",
    "g"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Yet another way of constructing a generator object is by a definition style that resembles the  standard function definition closely. The difference is that instead of the ``return`` statement, the ``yield`` statement is used."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "def g(start, end):\n",
    "    while start <= end:\n",
    "        yield start  # yield \"next\" value\n",
    "        start += 1  # increase by one"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Usage then might be as follows."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [],
   "source": [
    "go = g(15, 20)\n",
    "for _ in go:\n",
    "    print(_, end=' ')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## NumPy"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Many operations in computational finance take place over (large) arrays of numerical data. NumPy is a Python  library that allows the efficient handling of and operation on such data structures. Although quite a mighty library with a wealth of functionality, it suffices for the purposes of this book to cover the basics of NumPy."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The workhorse is the NumPy ``ndarray`` class which provides the data structure for n-dimensional, immutable array objects. You can generate an ``ndarray`` object e.g. out of a list object."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [],
   "source": [
    "a = np.array(range(24))\n",
    "a"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The power of these objects lies in the management of n-dimensional data structures (e.g. matrices or cubes of data)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [],
   "source": [
    "b = a.reshape((4, 6))\n",
    "b"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [],
   "source": [
    "c = a.reshape((2, 3, 4))\n",
    "c"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "So-called *standard arrays* (in contrast to e.g. structured arrays) have a single ``dtype`` (i.e. NumPy data type). Consider the following operation which changes the ``dtype`` parameter of the ``b`` object to float."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [],
   "source": [
    "b = np.array(b, dtype=np.float)\n",
    "b"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "A major strength of NumPy are *vectorized operations*."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [],
   "source": [
    "2 * b"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [],
   "source": [
    "b ** 2"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "You can also pass ``ndarray`` objects to lambda or standard Python functions."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [],
   "source": [
    "f = lambda x: x ** 2 - 2 * x + 0.5\n",
    "f(a)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "In many scenarios, only a (small) part of the data stored in a ``ndarray`` object is of interest. NumPy supports basic and advanced slicing and other selection features."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [],
   "source": [
    "a[2:6]  # 3rd to 6th element"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [],
   "source": [
    "b[2, 4]  # 3rd row, final (5th)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [],
   "source": [
    "b[1:3, 2:4]  # middle square of numbers"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Boolean operations are also supported in many places."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [],
   "source": [
    "## which numbers are larger than 10?\n",
    "b > 10"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [],
   "source": [
    "## only those numbers (flat) that are larger than 10\n",
    "b[b > 10]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Furthermore, ``ndarray`` objects have multiple (convenience) methods already built in."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [],
   "source": [
    "a.sum()  # sum of all elements"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [],
   "source": [
    "b.mean()  # mean of all elements"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [],
   "source": [
    "b.mean(axis=0)  # mean along 1st axis"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [],
   "source": [
    "b.mean(axis=1)  # mean along 2nd axis"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [],
   "source": [
    "c.std()  # standard deviation for all elements"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Similarly, there is a wealth of so-called *universal functions* that the NumPy library provides. Universal in the sense that they can be applied in general to NumPy ``ndarray`` objects and to standard numerical Python data types."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [],
   "source": [
    "np.sum(a)  # sum of all elements"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [],
   "source": [
    "np.mean(b, axis=0)  # mean alond 1st axis"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [],
   "source": [
    "np.sin(b).round(2)  # sine of all elements (rounded)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [],
   "source": [
    "np.sin(4.5)  # sine of Python float object"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "However, you should be aware that applying NumPy universal functions to standard Python data types generally comes with a significant performance burden."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "%time l = [np.sin(x) for x in range(100000)]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [],
   "source": [
    "import math\n",
    "%time l = [math.sin(x) for x in range(100000)]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Using the vectorized operations from NumPy is faster than both of the above alternatives which result in list objects."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "%time l = np.sin(np.arange(100000))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Here, we use the ndarray object constructor ``arange`` which yields an ``ndarray`` object of integers -- below a simple example."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [],
   "source": [
    "ai = np.arange(10)\n",
    "ai"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [],
   "source": [
    "ai.dtype"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Using this constructor, you can also generate ``ndarray`` objects with different ``dtype`` attributes."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [],
   "source": [
    "af = np.arange(0.5, 9.5, 0.5)  # start, end, step size\n",
    "af"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [],
   "source": [
    "af.dtype"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Useful in this context is also the ``linspace`` operator providing an ``ndarray`` object with evenly spaced numbers."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [],
   "source": [
    "np.linspace(0, 10, 12)  # start, end, number of elements"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "In financial analytics one often needs (pseudo-) random numbers. NumPy provides many functions to sample from different distributions. Those needed in this book are the standard normal distribution and the Poisson distribution. The respective functions are found in the sub-library ``numpy.random``. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "np.random.standard_normal(10) "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "np.random.poisson(0.5, 10)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Let us generate a ``ndarray`` object which is a bit more \"realistic\" and with which we work in what follows."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "np.random.seed(1000)  # fix the rng seed value\n",
    "data = np.random.standard_normal((5, 100))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Although this is a slightly larger array one cannot expect that the 500 numbers are indeed standard normally distributed in the sense that the first moment is 0 and the second moment is 1. However, at least this can be easily corrected."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [],
   "source": [
    "data.mean()  # should be 0.0"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [],
   "source": [
    "data.std()  # should be 1.0"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The correction is called moment matching and can be implemented with NumPy by vectorized operations. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [],
   "source": [
    "data = data - data.mean()  # correction for the 1st moment\n",
    "data = data / data.std()  # correction for the 2nd moment"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [],
   "source": [
    "data.mean()  # now really close to 0.0"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [],
   "source": [
    "data.std()  # now really close to 1.0"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## matplotlib"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "At this stage, it makes sense to introduce plotting with matplotlib, the plotting work horse in the Python ecosystem. We use matplotlib (see http://matplotlib.org) with the settings of another library throughout, namely seaborn (see http://stanford.edu/~mwaskom/software/seaborn/) &mdash; this results in a more modern plotting style."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [],
   "source": [
    "import matplotlib.pyplot as plt  # import main plotting library\n",
    "plt.style.use('seaborn')\n",
    "import matplotlib\n",
    "## set font to serif\n",
    "matplotlib.rcParams['font.family'] = 'serif'"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "A standard plot is the *line plot*. The result of the code below is shown in the following figure."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [],
   "source": [
    "plt.figure(figsize=(10, 6));  # size of figure\n",
    "plt.plot(data.cumsum());  # cumulative sum over all elements "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<p style=\"font-family: monospace;\">Standard line plot with matplotlib.</p>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Multiple lines plots are also easy to generate (see the following figure). The operator ``T`` stands for the  transpose of the ``ndarray`` object (\"matrix\")."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [],
   "source": [
    "plt.figure(figsize=(10, 6));  # size of figure\n",
    "## plotting five cumulative sums as lines\n",
    "plt.plot(data.T.cumsum(axis=0), label='line');\n",
    "plt.legend(loc=0);  # legend in best location\n",
    "plt.xlabel('data point');  # x axis label\n",
    "plt.ylabel('value');  # y axis label\n",
    "plt.title('random series');  # figure title"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<p style=\"font-family: monospace;\">Multiple lines plot with matplotlib.</p>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Other important plotting types are *histograms* and *bar charts*. A histogram for all 500 values of the ``data`` object is shown in the following figure. In the code, the ``flatten()`` method is used to generate a one-dimensional array from the two-dimensional one."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "plt.figure(figsize=(10, 6));  # size of figure\n",
    "plt.hist(data.flatten(), bins=30);"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<p style=\"font-family: monospace;\">Histrogram with matplotlib.</p>"
   ]
  },
  {
   "cell_type": "raw",
   "metadata": {},
   "source": [
    "Finally, consider the bar chart presented in the following figure."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "plt.figure(figsize=(10, 6));  # size of figure\n",
    "plt.bar(np.arange(1, 12) - 0.25, data[0, :11], width=0.5);"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<p style=\"font-family: monospace;\">Bar chart with matplotlib.</p>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To conclude the introduction to matplotlib consider the ordinary least squares (OLS) regression of the sample data displayed in the following figure. NumPy provides with the two functions ``polyfit()`` and ``polyval()`` convenience functions to implement OLS based on simple monomials, i.e. $x, x^2, x^3, ..., x^n$. For illustration purposes consider linear, quadratic and cubic OLS."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [],
   "source": [
    "x = np.arange(len(data.cumsum()))\n",
    "y = data.cumsum()\n",
    "rg1 = np.polyfit(x, y, 1)  # linear OLS\n",
    "rg2 = np.polyfit(x, y, 2)  # quadratic OLS\n",
    "rg3 = np.polyfit(x, y, 3)  # cubic OLS"
   ]
  },
  {
   "cell_type": "raw",
   "metadata": {},
   "source": [
    "The following code and figure illustrates the regression results graphically."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [],
   "source": [
    "plt.figure(figsize=(10, 6));\n",
    "plt.plot(x, y, 'r', label='data');\n",
    "plt.plot(x, np.polyval(rg1, x), 'b--', label='linear');\n",
    "plt.plot(x, np.polyval(rg2, x), 'b-.', label='quadratic');\n",
    "plt.plot(x, np.polyval(rg3, x), 'b:', label='cubic');\n",
    "plt.legend(loc=0);"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<p style=\"font-family: monospace;\">Linear, quadratic and cubic regression."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## pandas"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "pandas is a library with which you can manage and operate on time series data and other tabular data structures. It allows to implement even sophisticated data analytics tasks on larger data sets. While the focus lies on in-memory operations, there are also multiple options for out-of-memory (on-disk) operations. Although pandas provides a number of different data structures, embodied by powerful classes, the most oftenly used structure is the ``DataFrame`` class which resembles a typical table of a relational (SQL) database and is used to manage, for instance, financial time series data. This is what we focus on in this section."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### pandas DataFrame class"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "In its most basic form, a ``DataFrame`` object is characterized by an index, column names and tabular data. To make this more specific, consider the following sample data set."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [],
   "source": [
    "np.random.seed(1000)\n",
    "a = np.random.standard_normal((10, 3)).cumsum(axis=0)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Also, consider the following dates which shall be our index."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "index = ['2016-1-31', '2016-2-28', '2016-3-31',\n",
    "        '2016-4-30', '2016-5-31', '2016-6-30',\n",
    "        '2016-7-31', '2016-8-31', '2016-9-30',\n",
    "        '2016-10-31']"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Finally, the column names."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "columns = ['no1', 'no2', 'no3']"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The instantiation of a ``DataFrame`` object then looks as follows."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [],
   "source": [
    "import pandas as pd\n",
    "df = pd.DataFrame(a, index=index, columns=columns)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "A look at the new object reveals its resemblence with a typical table from a relational database (or e.g. an Excel spreadsheet)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [],
   "source": [
    "df"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "DataFrame objects have built in a multitude of basic, advanced and convenience methods, a few of which are illustrated without much commentary below."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [],
   "source": [
    "df.head()  # first five rows"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [],
   "source": [
    "df.tail()  # last five rows"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [],
   "source": [
    "df.index  # index object"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [],
   "source": [
    "df.columns  # column names"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [],
   "source": [
    "df.info()  # meta information"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [],
   "source": [
    "df.describe()  # typical statistics"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Numerical operations are in general as easy with ``DataFrame`` objects as with NumPy ``ndarray`` objects. They are also quite close in terms of syntax."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [],
   "source": [
    "df * 2  # vectorized multiplication"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [],
   "source": [
    "df.std()  # standard deviation by column"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [],
   "source": [
    "df.mean(axis=1)  # mean by index value"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [],
   "source": [
    "np.mean(df)  # mean via universal function"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Pieces of data can be looked up via different mechanisms."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [],
   "source": [
    "df['no2']  # 2nd column"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [],
   "source": [
    "df.iloc[0]  # 1st row"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [],
   "source": [
    "df.iloc[2:4]  # 3rd & 4th row"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [],
   "source": [
    "df.iloc[2:4, 1]  # 3rd & 4th row, 2nd column"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [],
   "source": [
    "df.no3.iloc[3:7]  # dot look-up for column name"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [],
   "source": [
    "df.loc['2016-3-31']  # row given index value"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [],
   "source": [
    "df.loc['2016-5-31', 'no3']  # single data point"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [],
   "source": [
    "df['no1'] + 3 * df['no3']  # vectorized arithmetic operations"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Data selections based on Boolean operations are also a strength of pandas."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [],
   "source": [
    "df['no3'] > 0.5"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [],
   "source": [
    "df[df['no3'] > 0.5]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [],
   "source": [
    "df[(df.no3 > 0.5) & (df.no2 > -0.25)]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [],
   "source": [
    "df[df.index > '2016-4-30']"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "pandas is well integrated with the matplotlib library which makes it really convenient to plot data stored in ``DataFrame`` objects. In general, a single method call does the trick already (see the following figure). "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [],
   "source": [
    "df.plot(figsize=(10, 6));"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<p style=\"font-family: monospace;\">Line plot from pandas DataFrame."
   ]
  },
  {
   "cell_type": "raw",
   "metadata": {},
   "source": [
    "Histograms are also generated this way (see the following figure). In both cases, pandas takes care of the handling of the single columns and automatically generates single lines (with respective legend entries) and generates respective sub-plots with three different histograms."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [],
   "source": [
    "df.hist(figsize=(10, 6));"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<p style=\"font-family: monospace;\">Histograms from pandas DataFrame."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Input-Output Operations"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Another strength of pandas is the exporting and importing of data to and from diverse data storage formats. Consider the case of comma separated value (CSV) files."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "df.to_csv('data.csv')  # exports to CSV file"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Let us have a look at the just saved file with basic Python functionality."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [],
   "source": [
    "with open('data.csv') as f:  # open file\n",
    "    for l in f.readlines():  # iterate over all lines\n",
    "        print(l, end='')  # print line"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Reading data from such files is also straightforward."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [],
   "source": [
    "from_csv = pd.read_csv('data.csv',  # filename\n",
    "                      index_col=0,  # index column\n",
    "                      parse_dates=True)  # date index\n",
    "from_csv.head()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "However, in general you would store ``DataFrame`` objects on disk in more efficient binary formats like HDF5 (see http://hdfgroup.org). pandas in this case wraps the functionality of the PyTables library (see http://pytables.org). The constructor function to be used is ``HDFStore()``."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [],
   "source": [
    "h5 = pd.HDFStore('data.h5', 'w')  # open for writing\n",
    "h5['df'] = df  # write object to database\n",
    "h5"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Data retrieval is as simple as writing."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [],
   "source": [
    "from_h5 = h5['df']  # reading from database\n",
    "h5.close()  # closing the database\n",
    "from_h5.tail()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [],
   "source": [
    "!rm data.csv data.h5 # remove the objects from disk"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Financial Analytics Examples"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "When it comes to financial data, there are useful data retrieval functions available that wrap both the Yahoo! Finance and Google Finance financial data APIs. The following code reads historical daily data for the S&P 500 index and the VIX volatility index.  "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "url = 'https://hilpisch.com/vola_eikon_eod_data.csv'"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "spx = pd.DataFrame(pd.read_csv(url, index_col=0, parse_dates=True)['.SPX'])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [],
   "source": [
    "spx.info()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "vix = pd.DataFrame(pd.read_csv(url, index_col=0, parse_dates=True)['.VIX'])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [],
   "source": [
    "vix.info()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [],
   "source": [
    "vix.info()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Let us combine the respective ``Close`` columns into a single ``DataFrame`` object. Muliple ways are possible to accomplish this goal. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [],
   "source": [
    "## construction via join\n",
    "spxvix = pd.DataFrame(spx).join(vix)\n",
    "spxvix.info()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [],
   "source": [
    "## construction via merge\n",
    "spxvix = pd.merge(spx, vix,\n",
    "                  left_index=True,  # merge on left index\n",
    "                  right_index=True,  # merge on right index\n",
    "                )\n",
    "spxvix.info()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "In a case like this, the approach via `dictionary` objects might be the best and most intuitive way."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [],
   "source": [
    "## construction via dictionary object\n",
    "spxvix = pd.DataFrame({'SPX': spx.values.flatten(),\n",
    "                       'VIX': vix.values.flatten()},\n",
    "                       index=spx.index)\n",
    "spxvix.info()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Having available the merged data in a single object makes visual analysis straightforward (see the following figure)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "spxvix.plot(figsize=(10, 6), subplots=True, color='b');"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<p style=\"font-family: monospace;\">Historical closing data for S&P 500 stock index and VIX volatility index."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "pandas also allows vectorized operations on whole ``DataFrame`` objects. The following code calculates the log returns over the two columns of the ``spxvix`` object simulateneously in vectorized fashion. The ``shift()`` method shifts the data set by the number of index values as provided (in this particular by one trading day). "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [],
   "source": [
    "rets = np.log(spxvix / spxvix.shift(1))\n",
    "rets.head()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "There is one row of log returns missing at the very beginning. This row can be deleted via the ``dropna()`` method."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [],
   "source": [
    "rets = rets.dropna()\n",
    "rets.head()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Consider the plot in the following figure showing the VIX log returns against the SPX log returns in a scatter plot with a linear regression. It illustrates a strong negative correlation between the two indexes. This is a central result that is replicated in the chapter about the EURO STOXX 50 stock index and the VSTOXX volatility index, respectively."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [],
   "source": [
    "rets.plot(kind='scatter', x='SPX', y='VIX',\n",
    "              style='.', figsize=(10, 6));\n",
    "rg = np.polyfit(rets['SPX'], rets['VIX'], 1)\n",
    "plt.plot(rets['SPX'], np.polyval(rg, rets['SPX']), 'r.-');"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<p style=\"font-family: monospace;\">Daily log returns for S&P 500 stock index and VIX volatility index and regression line."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Having financial time series data stored in a pandas ``DataFrame`` object makes the calculation of typical statistics straightforward."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [],
   "source": [
    "ret = rets.mean() * 252  # annualized return\n",
    "ret"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [],
   "source": [
    "vol = rets.std() * math.sqrt(252)  # annualized volatility\n",
    "vol"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [],
   "source": [
    "(ret - 0.01) / vol  # Sharpe ratio with rf = 0.01"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The *maximum drawdown*, which we only calculate for the S&P 500 index, is a bit more involved. For its calculation, we use the ``cummax()`` method which records the running, historical maximum of the time series up to a certain date. Consider the plot in the following figure which shows the S&P 500 index and the running maximum."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "plt.figure(figsize=(10, 6));\n",
    "spxvix['SPX'].plot(label='S&P 500');\n",
    "spxvix['SPX'].cummax().plot(label='running maximum');\n",
    "plt.legend(loc=0);"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<p style=\"font-family: monospace;\">S&P 500 stock index and running maximum value."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The maximum drawdown is the largest difference between the running maximum and the current index level &mdash; in our particular case it is 1,148."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [],
   "source": [
    "adrawdown = spxvix['SPX'].cummax() - spxvix['SPX']\n",
    "adrawdown.max()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The relative maximum drawdown might sometimes be a bit more meaningful. It is here a drawdown of about 34%."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [],
   "source": [
    "rdrawdown = (spxvix['SPX'].cummax() - spxvix['SPX']) / spxvix['SPX'].cummax() \n",
    "rdrawdown.max()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The longest drawdown period is calculated as follows. The code below selects all those data points where the drawdown is zero. It then calculates the difference between two consecutive index values (i.e. trading dates) for which the drawdown is zero and takes the maxmimum value. Given the data set we are analyzing, the longest drawdown period is 417 days."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [],
   "source": [
    "temp = adrawdown[adrawdown == 0]\n",
    "periods_spx = (temp.index[1:].to_pydatetime()\n",
    "             - temp.index[:-1].to_pydatetime())\n",
    "periods_spx[50:60]  # some selected data points"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [],
   "source": [
    "max(periods_spx)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "See Appendix C of Hilpisch (2018) for the handling of date-time information with Python, NumPy and pandas."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Conclusions"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "This chapter introduces basic data types and structures as well as certain Python idioms needed for analyses in later chapters of the book. In addition, NumPy and the ``ndarray`` class are introduced which allow the efficient handling of and operating on (numerical) data stored as arrays. Some basic visualization techniques using the matplotlib library are also introduced. However, working with pandas and its powerful ``DataFrame`` class for tabular and time series data makes plotting a bit more convenient &mdash; only a single method call is needed in general. Using pandas and the capabilities of the ``DataFrame`` class, the chapter also illustrates by the means of some basic financial examples how to implement typical interactive financial analytics tasks.  "
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.8.6"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}
